Flexible feature extraction and HMM design for a hybrid distributed speech recognition system in noisy environments
Authors
Abstract
Client devices for distributed speech recognition (DSR) are usually operated in the presence of background noise, since most DSR scenarios are set outside the office. The general task is therefore to choose the feature extraction method best suited to the given conditions. We present a hybrid speech recognition approach implemented for DSR that allows arbitrary feature vectors to be chosen, in both number and range of values, without changing the amount of data sent to the recognition engine. Experiments were carried out using mel-cepstrum and RASTA-PLP features on the AURORA database. The results show how recognition performance under different noise conditions can be adjusted by combining the different features, and that our hybrid approach to DSR offers advantages that are not easily obtained with traditional DSR architectures.
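The abstract does not say how the transmitted data volume is kept constant. One mechanism with this property, offered here purely as an assumption and not as the authors' method, is vector quantization on the client: each feature frame is mapped to the index of its nearest codebook entry, so the payload per frame depends only on the codebook size, not on how many features are in the frame. The sketch below uses random placeholder data, and the dimensions (13 mel-cepstrum and 9 RASTA-PLP coefficients), the codebook size, and all function names are hypothetical.

```python
import math
import random


def nearest_codeword(frame, codebook):
    """Return the index of the codebook entry closest to `frame` (squared Euclidean)."""
    best_index, best_dist = 0, float("inf")
    for index, codeword in enumerate(codebook):
        dist = sum((x - c) ** 2 for x, c in zip(frame, codeword))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def encode(frames, codebook):
    """Client side (assumed): map each feature frame to a single integer index."""
    return [nearest_codeword(frame, codebook) for frame in frames]


def bits_per_frame(codebook_size):
    """Bits needed to transmit one codebook index."""
    return math.ceil(math.log2(codebook_size))


def random_vectors(count, dim, rng):
    """Placeholder data standing in for real feature frames or trained codewords."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(count)]


rng = random.Random(0)
CODEBOOK_SIZE = 256  # hypothetical: 256 entries -> 8 bits per frame

# Hypothetical dimensions: 13 mel-cepstrum and 9 RASTA-PLP coefficients per frame.
mfcc_frames = random_vectors(5, 13, rng)
combined_frames = random_vectors(5, 13 + 9, rng)  # both streams concatenated

mfcc_codebook = random_vectors(CODEBOOK_SIZE, 13, rng)
combined_codebook = random_vectors(CODEBOOK_SIZE, 13 + 9, rng)

mfcc_indices = encode(mfcc_frames, mfcc_codebook)
combined_indices = encode(combined_frames, combined_codebook)

# The payload is one index per frame in both cases, regardless of feature dimension.
payload_bits = bits_per_frame(CODEBOOK_SIZE)
```

In this sketch, switching from the 13-dimensional to the 22-dimensional frames changes only the codebook held on the client; the stream sent to the recognizer remains one index of `payload_bits` bits per frame.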
Similar resources
Recognition of Persian accents from the speech signal using efficient feature extraction methods and classifier combination
Speech recognition has improved greatly in recent years. However, robustness remains one of the major problems: recognition performance fluctuates sharply depending on the speaker, especially when the speaker has a strong accent, and different accents dramatically decrease the accuracy of an ASR system. In this paper we apply three new methods of feature extraction including Spectral C...
Speech recognition with a new hybrid architecture combining neural networks and continuous HMM
Abstract. In this paper, we focus on a novel NN/HMM architecture for continuous speech recognition. The architecture incorporates a neural feature extraction stage to obtain more discriminative feature vectors for the underlying HMM system. The feature extraction can be either linear or non-linear and can incorporate recurrent connections. With this hybrid system, which is an extension of a state...
Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions
Automatic recognition of emotional states from speech in noisy conditions has become an important research topic in emotional speech recognition in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...
Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model
To facilitate the entry of data into the computer and its digitization, automatic recognition of printed texts and manuscripts is a considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...
Robust ASR using Support Vector Machines
The improved theoretical properties of Support Vector Machines with respect to other machine learning alternatives due to their max-margin training paradigm have led us to suggest them as a good technique for robust speech recognition. However, important shortcomings have had to be circumvented, the most important being the normalisation of the time duration of different realisations of the aco...